In this project, I want to tell a data story about happy songs. Music accompanies us every day. It lifts us up when we are down, and sometimes moves us to tears. I want to learn more about happy songs from the perspective of their lyrics.

1. What is the Happiest Genre?

To find the happiest genre based on lyrics, I started with sentiment analysis. I used the provided stemmed lyrics, because the stop words removed during cleaning carry little emotion and are irrelevant to the analysis.

The next question is how to define a happy song. Each song contains a mixture of different sentiments. I consider a song happy when joy accounts for the largest share of its sentiment, and I use this definition throughout. In this project, the sentiments of the stemmed keywords are taken from the NRC Lexicon.
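As a minimal sketch of this definition, the Python snippet below counts the emotions attached to each stemmed word and takes the most frequent one as the song's major emotion. The lexicon is a tiny made-up stand-in, not the real NRC data, and the names `TOY_LEXICON`, `major_emotion`, and `is_happy` are hypothetical. The project itself may compute this differently.

```python
from collections import Counter

# Toy stand-in for the NRC lexicon: stemmed word -> emotions.
# These entries are illustrative assumptions, not real NRC data.
TOY_LEXICON = {
    "love": ["joy"],
    "smile": ["joy"],
    "danc": ["joy"],
    "cri": ["sadness"],
    "fear": ["fear"],
    "wait": ["anticipation"],
}


def major_emotion(stemmed_words, lexicon):
    """Return the emotion matched by the most words, or None if nothing matches."""
    counts = Counter()
    for word in stemmed_words:
        for emotion in lexicon.get(word, []):
            counts[emotion] += 1
    if not counts:
        return None
    # most_common(1) gives a one-element list holding the top (emotion, count) pair.
    return counts.most_common(1)[0][0]


def is_happy(stemmed_words, lexicon):
    """A song counts as happy when joy is its major emotion."""
    return major_emotion(stemmed_words, lexicon) == "joy"


example = "love smile cri love danc".split()
print(major_emotion(example, TOY_LEXICON))
print(is_happy(example, TOY_LEXICON))
```

Ties between emotions are not handled specially in this sketch; a real analysis would need an explicit rule for them.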

The following table gives some examples of the major emotion identified for each song, which was added to the data frame for further analysis.

song                    genre    major.emotion
when-you-were-with-me   Hip-Hop  joy
careless-whisper        Hip-Hop  sadness
2-59                    Hip-Hop  anticipation
power-of-desire         Hip-Hop  fear
you-re-not-in-love      Hip-Hop  disgust

After completing the sentiment analysis, I made a bar plot, shown above, to see how the songs of each genre are distributed among the eight emotion categories. From the plot, I noticed that joy is the most common major emotion in nearly all genres; the exceptions are Hip-Hop, which mainly centers on anger, and Metal, which centers on fear. Notably, Jazz is the happiest genre in the given corpus: 43.3% of Jazz songs have joy as their major emotion.
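The per-genre percentages behind such a plot can be sketched as follows. This is only an illustration under stated assumptions: the `songs` records are made up and do not come from the project's corpus, and `emotion_shares` and `happiest_genre` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Hypothetical (genre, major emotion) records; not the project's data.
songs = [
    ("Jazz", "joy"), ("Jazz", "joy"), ("Jazz", "sadness"),
    ("Metal", "fear"), ("Metal", "fear"), ("Metal", "joy"),
]


def emotion_shares(records):
    """Map each genre to {emotion: fraction of that genre's songs}."""
    counts = defaultdict(Counter)
    for genre, emotion in records:
        counts[genre][emotion] += 1
    shares = {}
    for genre, emotion_counts in counts.items():
        total = sum(emotion_counts.values())
        shares[genre] = {e: n / total for e, n in emotion_counts.items()}
    return shares


def happiest_genre(records):
    """Return the genre with the largest share of joy songs."""
    shares = emotion_shares(records)
    return max(shares, key=lambda g: shares[g].get("joy", 0.0))


print(emotion_shares(songs))
print(happiest_genre(songs))
```

Comparing shares rather than raw counts keeps genres with many songs from dominating the comparison.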

Next, I investigate further and compare the two happiest genres according to the plot above: Jazz and R&B.

2. Focusing on the top 2 happiest genres and their “joy” songs:

- Are they similar or different?

2.1 Overview of the most frequent words

To get an overall understanding of the “joy” songs in the Jazz and R&B genres, I first looked at the most frequently used words through wordclouds.
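The word counts that a wordcloud visualizes can be sketched in a few lines of Python. The `toy` lyrics and the helper name `top_words` are assumptions for illustration only; the input is taken to be already stemmed and whitespace-separated.

```python
from collections import Counter


def top_words(lyrics_list, n=3):
    """Count stemmed words across all songs and return the n most frequent."""
    counts = Counter()
    for lyrics in lyrics_list:
        counts.update(lyrics.split())
    return counts.most_common(n)


# Made-up stemmed lyrics, not taken from the corpus.
toy = ["love babi love heart", "babi love night"]
print(top_words(toy, n=2))
```

A wordcloud then simply scales each word's size by its count.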

For “joy” songs in Jazz and R&B combined:

For Jazz:

For R&B:

From the wordcloud plots, I found that Jazz and R&B use similar stemmed keywords to express joy. Whether I look at a single genre alone or at the two genres combined, love, baby, heart, etc. appear among the most frequent words in all the wordcloud plots above. This finding made me wonder how similar their song topics are within the joy category. Do these joy songs share not only the same emotion but also similar topics? Do different genres sing about the same joyful topics?

With these questions in mind, I turned to topic modelling.

2.2 Topic modelling

Before topic modelling, I converted the stemmed words from the data frame back into a Corpus to generate the document-term matrix. Then I used LDA (Latent Dirichlet Allocation) to assign topics to these joy Jazz and R&B songs. I chose the number of topics by trial and error, so that meaningful topics could be inferred from the grouped keywords. In the end, I set the number of topics to seven.
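For readers unfamiliar with the term, a document-term matrix has one row per song and one column per distinct word, with each cell holding a word count. The sketch below builds one in plain Python. It covers only this preprocessing step, not LDA itself, and the `docs` list and the function name `document_term_matrix` are illustrative assumptions.

```python
def document_term_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts word j in document i."""
    vocabulary = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocabulary)}
    matrix = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for word in doc.split():
            row[index[word]] += 1
        matrix.append(row)
    return vocabulary, matrix


# Made-up stemmed lyrics, one string per song.
docs = ["love babi love", "heart danc love"]
vocab, dtm = document_term_matrix(docs)
print(vocab)
for row in dtm:
    print(row)
```

An LDA implementation takes a matrix of this shape, together with the chosen number of topics, as its input.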

After running LDA, I manually hashtagged each topic according to its most popular and most salient terms. For example, I labelled Topic 3, which becomes of interest later, as “Love”, since “woman”, “babe”, “dear”, “thrill”, etc. are the keywords for this topic.

From the plot above, I can see how Jazz and R&B songs are distributed among these summarized topics. The most obvious difference is that, in percentage terms, R&B has more happy love songs than Jazz. For the other topics, Jazz and R&B have roughly the same proportion of songs.

In summary, the most noticeable discovery is that, although Jazz has a larger share of joyful songs than R&B, R&B focuses more on the love topic than Jazz does. Apart from this moderate difference in the love topic, the two genres have roughly the same percentage of happy songs on the other topics. This high similarity in topics explains the high similarity in the frequent keywords shown in the wordclouds, since vocabulary does not vary much across the same or similar topics.

2.3 Lyrics Manifestation

Besides the keywords and topics of a song, the pattern of its lyrics also plays an important role in expressing happiness. In this section, I focus on the repetitiveness of lyrics and analyse the difference between Jazz and R&B joy songs from this perspective.

Why is the repetitiveness of lyrics important?
Repetition is very important in music, where melody and lyrics are often repeated. However, the level of repetitiveness varies a lot between songs. For example, a mostly narrative song is less repetitive than a non-narrative one. High repetitiveness sacrifices storytelling ability; on the other hand, repetitive lyrics assist memory and help drive market success.

So how repetitive are joy Jazz and R&B lyrics?

According to the plot, the lyrics repetition ratios of happy R&B and Jazz songs do not differ greatly. On average, R&B songs have a repetition ratio of slightly over 47.2%, and Jazz songs 40.4%. However, because stop words were filtered out in advance, the actual repetition ratios should be considerably larger than these results. In summary, R&B songs are somewhat more repetitive than Jazz songs and therefore easier to remember, which may make it easier for them to become popular. This could help explain why R&B is far more popular than Jazz in the current music market, as analysed by Statista.
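The text does not spell out how the repetition ratio is computed. One plausible definition, assumed here purely for illustration, is the share of word occurrences that repeat an earlier word: 1 minus the number of distinct words divided by the total number of words. The function names `repetition_ratio` and `mean_repetition` are hypothetical.

```python
def repetition_ratio(words):
    """Assumed definition: 1 - distinct words / total words (0.0 for empty input)."""
    if not words:
        return 0.0
    return 1 - len(set(words)) / len(words)


def mean_repetition(songs):
    """Average the repetition ratio over a list of whitespace-separated lyrics."""
    ratios = [repetition_ratio(song.split()) for song in songs]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Made-up stemmed lyrics: 4 words, 2 distinct.
print(repetition_ratio("love love love babi".split()))
print(mean_repetition(["love love", "love babi"]))
```

Under this definition, a song that never repeats a word scores 0, and the score approaches 1 as a few words dominate the lyrics.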

3. Conclusion

  1. Using sentiment analysis, I found that the happiest genre is Jazz. In addition, joy is the most common emotion in all genres except Metal and Hip-Hop.

  2. Compare happy songs between Jazz and R&B: